A survey of modern authorship attribution methods

Author

  • Efstathios Stamatatos
Abstract

Authorship attribution supported by statistical or computational methods has a long history, starting in the 19th century and marked by the seminal study of Mosteller and Wallace (1964) on the authorship of the disputed Federalist Papers. During the last decade, this scientific field has developed substantially, taking advantage of research advances in areas such as machine learning, information retrieval, and natural language processing. The plethora of available electronic texts (e.g., e-mail messages, online forum messages, blogs, source code) indicates a wide variety of applications for this technology, provided it is able to handle short and noisy text from multiple candidate authors. This paper presents a survey of recent advances in automated approaches to attributing authorship, examining their characteristics for both text representation and text classification. The focus of this survey is on computational requirements and settings rather than linguistic or literary issues. We also discuss evaluation methodologies and criteria for authorship attribution studies and list open questions that will attract future work in this area.


Similar resources

Enhancing Authorship Attribution By Utilizing Syntax Tree Profiles

The aim of modern authorship attribution approaches is to analyze known authors and to assign authorships to previously unseen and unlabeled text documents based on various features. In this paper we present a novel feature to enhance current attribution methods by analyzing the grammar of authors. To extract the feature, a syntax tree of each sentence of a document is calculated, which is then...


A Survey on Authorship Analysis

The paper discusses the problem of authorship analysis and its different types, such as authorship attribution, authorship identification, authorship profiling, and plagiarism detection. It also addresses the issues in Indian language text. Keywords: authorship attribution, authorship profiling, plagiarism detection, text classification.


Proving and Improving Authorship Attribution Technologies

Who wrote Primary Colors? Can a computer help us make that call? Despite a century of research, statistical and computational methods for authorship attribution are neither reliable, well-regarded, widely-used, or well-understood. This paper presents a survey of the current state-of-the-art as well as a framework for uniform and unified development of a tool to apply the state-of-the-art, despit...


Function Words in Authorship Attribution. From Black Magic to Theory?

This position paper focuses on the use of function words in computational authorship attribution. Although recently there have been multiple successful applications of authorship attribution, the field is not particularly good at the explication of methods and theoretical issues, which might eventually compromise the acceptance of new research results in the traditional humanities community. I ...


A Prototype for Authorship Attribution Studies

Despite a century of research, statistical and computational methods for authorship attribution are neither reliable, well-regarded, widely-used, or well-understood. This paper presents a survey of the current state-of-the-art as well as a framework for uniform and unified development of a tool to apply the state-of-the-art, despite the wide variety of methods and techniques used. The usefulness...



Journal title:
  • JASIST

Volume 60

Publication date: 2009